Message-passing code generation for non-rectangular tiling transformations

نویسندگان

  • Georgios I. Goumas
  • Nikolaos Drosinos
  • Maria Athanasaki
  • Nectarios Koziris
چکیده

Tiling is a well known loop transformation used to reduce communication overhead in distributed memory machines. Although a lot of theoretical research has been done concerning the selection of proper tile shapes that reduce processor idle times, there is no complete approach to automatically parallelize non-rectangularly tiled iteration spaces and consequently there are no actual experimental results to verify previous theoretical work on the effect of the tile shape on the overall completion time of a tiled algorithm. This paper presents a complete end-to-end framework to generate automatic message-passing code for tiled iteration spaces. It considers general parallelepiped tiling transformations and convex iteration spaces. We aim to address all problems concerning data parallel code generation efficiently by transforming the initial non-rectangular tile to a rectangular one. In this way, data distribution and the respective communication pattern become simple and straightforward. We have implemented our parallelizing techniques in a tool which automatically generates MPI code and run several benchmarks on a cluster of PCs. Our experimental results show the merit of general parallelepiped tiling transformations, and verify previous theoretical work on scheduling-optimal, non-rectangular tile shapes. 2006 Elsevier B.V. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Compiling Tiled Iteration Spaces for Clusters

This paper presents a complete end-to-end framework to generate automatic message-passing code for tiled iteration spaces. It considers general parallelepiped tiling transformations and general convex iteration spaces. We aim to address all problems concerning data parallel code generation efficiently by transforming the initial non-rectangular tile to a rectangular one. In this way, data distr...

متن کامل

Generating efficient tiled code for distributed memory machines

Abstract — Tiling can improve the performance of nested loops on distributed memory machines by exploiting coarse-grain parallelism and reducing communication overhead and frequency. Tiling calls for a compilation approach that performs first computation distribution and then data distribution, both possibly on a skewed iteration space. This paper presents a suite of compiler techniques for gen...

متن کامل

Delivering High Performance to Parallel Applications Using Advanced Scheduling

This paper presents a complete framework for the parallelization of nested loops by applying tiling transformation and automatically generating MPI code allowing for an advanced scheduling scheme. In particular, under advanced scheduling scheme we consider two separate techniques: first, the application of a suitable tiling transformation, and second the overlapping of computation and communica...

متن کامل

Code Generation Methods for Tiling Transformations

Tiling or supernode transformation has been widely used to improve locality in multi-level memory hierarchies, as well as to efficiently execute loops onto parallel architectures. However, automatic code generation for tiled loops can be a very complex compiler work due to non-rectangular tile shapes and arbitrary iteration space bounds. In this paper, we first survey code generation methods fo...

متن کامل

An Efficient Code Generation Technique for Tiled Iteration Spaces

This paper presents a novel approach for the problem of generating tiled code for nested for-loops, transformed by a tiling transformation. Tiling or supernode transformation has been widely used to improve locality in multi-level memory hierarchies, as well as to efficiently execute loops onto parallel architectures. However, automatic code generation for tiled loops can be a very complex comp...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Parallel Computing

دوره 32  شماره 

صفحات  -

تاریخ انتشار 2006